January 27 + 29, 2025
Important
Before Wednesday, read: Tufte. 1997. Visual and Statistical Thinking: Displays of Evidence for Making Decisions. (Use Google to find it.)
What was Hilary trying to answer in her data collection?
Name two of Hilary’s main hurdles in gathering accurate data.
Which is better: high touch (manual) or low touch (automatic) data collection? Why?
What additional covariates are needed / desired? Any problems with them?
How much data does she need?
Are there any ethical considerations to think about?
Based on https://www.effectivedatastorytelling.com/post/a-deeper-dive-into-lego-bricks-and-data-stories, original source: https://www.linkedin.com/learning/instructors/bill-shander
Yau (2013) gives us nine visual cues, and Wickham (2014) translates them into a language using ggplot2.
Visual Cues: the aspects of the figure where we should focus.
Position (numerical) where in relation to other things?
Length (numerical) how big (in one dimension)?
Angle (numerical) how wide? parallel to something else?
Direction (numerical) at what slope? In a time series, going up or down?
Shape (categorical) belonging to what group?
Area (numerical) how big (in two dimensions)? Beware of improper scaling!
Volume (numerical) how big (in three dimensions)? Beware of improper scaling!
Shade (either) to what extent? how severely?
Color (either) to what extent? how severely? Beware of red/green color blindness.
Coordinate System: rectangular, polar, geographic, etc.
Scale: numeric (linear? logarithmic?), categorical (ordered?), time
Context: in comparison to what (think back to ideas from Tufte)
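The "improper scaling" warning for Area and Volume comes down to a little arithmetic, sketched below. This is a minimal illustration, not something from Yau (2013) or ggplot2. The function names and the data values (10 and 40) are hypothetical and chosen only to make the ratios easy to read. If a circle's radius is scaled directly by the data value, the area the reader sees grows with the square of the value. Scaling the radius by the square root (or a cube's side by the cube root) keeps the visual cue proportional to the data.

```python
import math


def radius_for_value(value, ref_value, ref_radius):
    """Radius that makes a circle's AREA proportional to value (hypothetical helper)."""
    return ref_radius * math.sqrt(value / ref_value)


def side_for_value(value, ref_value, ref_side):
    """Side length that makes a cube's VOLUME proportional to value (hypothetical helper)."""
    return ref_side * (value / ref_value) ** (1 / 3)


def circle_area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2


# Hypothetical data: the second quantity is 4 times the first.
small, large = 10, 40
r_small = 1.0  # radius used to draw the smaller value

# Improper scaling: the radius grows in proportion to the data.
r_naive = r_small * (large / small)

# Proper scaling: the area grows in proportion to the data.
r_scaled = radius_for_value(large, small, r_small)

naive_ratio = circle_area(r_naive) / circle_area(r_small)
scaled_ratio = circle_area(r_scaled) / circle_area(r_small)

print(f"ratio in the data:          {large / small:.1f}")
print(f"area ratio, radius scaled:  {naive_ratio:.1f}")
print(f"area ratio, area scaled:    {scaled_ratio:.1f}")
```

With these made-up numbers, scaling the radius makes the larger circle cover 16 times the area for a 4-fold difference in the data, while scaling by the square root keeps the area ratio at 4. The same reasoning with a cube root applies to volume.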
Visual Cues of Yau (2013):
Position (numerical)
Length (numerical)
Angle (numerical)
Direction (numerical)
Shape (categorical)
Area (numerical)
Volume (numerical)
Shade (either)
Color (either)
Attributes can focus your reader’s attention.
Make the data stand out
Facilitate comparison
Add information
(Nolan & Perrett, 2016)
Tufte lists two main motivational steps to working with graphics as part of an argument.
“An essential analytic task in making decisions based on evidence is to understand how things work.”
“Making decisions based on evidence requires the appropriate display of that evidence.”
Tufte (1997) Visual and Statistical Thinking: Displays of Evidence for Making Decisions. (Use Google to find it.)
How many aspects of this graph can you point out which are relevant to figuring out that cholera infection was coming from a single pump? Are there any distracting aspects?
Why would the outbreak already have begun to decline before the pump handle was removed?
One of the graphics that was particularly unconvincing in trying to explain that O-rings fail in the cold.